Ranking large document collections by a state space search
Similar resources
Entropy-Based Authorship Search in Large Document Collections
The purpose of authorship search is to identify documents written by a particular author in large document collections. Standard search engines match documents to queries based on topic, and are not applicable to authorship search. In this paper we propose an approach to authorship search based on information theory. We propose relative entropy of style markers for ranking, inspired by the lang...
Efficient Search in Document Image Collections
This paper presents an efficient indexing and retrieval scheme for searching in document image databases. In many non-European languages, optical character recognizers are not very accurate. Word spotting (word image matching) may instead be used to retrieve word images in response to a word image query. The approaches used for word spotting so far, dynamic time warping and/or nearest neighbor sea...
Mediated access to very large document collections
Mediation based on structured specialised collections is proposed as an approach to supporting the exploration of very large document collections such as the Web. The main techniques employed are statistical language modelling and document clustering. The paper discusses the new interaction model and presents experimental results obtained with a prototypical system. References to papers which d...
Journal
Journal title: Information Processing & Management
Year: 1991
ISSN: 0306-4573
DOI: 10.1016/0306-4573(91)90029-l